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Abstract 

Background: Proteins are dynamic molecules with motions ranging from picoseconds to longer than seconds. 
Many protein functions, however, appear to occur on the micro to millisecond timescale and therefore there has 
been intense research of the importance of these motions in catalysis and molecular interactions. Nuclear Magnetic 
Resonance (NMR) relaxation dispersion experiments are used to measure motion of discrete nuclei within the 
micro to millisecond timescale. Information about conformational/chemical exchange, populations of exchanging 
states and chemical shift differences are extracted from these experiments. To ensure these parameters are 
correctly extracted, accurate and careful analysis of these experiments is necessary. 

Results: The software introduced in this article is designed for the automatic analysis of relaxation dispersion data 
and the extraction of the parameters mentioned above. It is written in Python for multi platform use and highest 
performance. Experimental data can be fitted to different models using the Levenberg-Marquardt minimization 
algorithm and different statistical tests can be used to select the best model. To demonstrate the functionality of 
this program, synthetic data as well as NMR data were analyzed. Analysis of these data including the generation of 
plots and color coded structures can be performed with minimal user intervention and using standard procedures 
that are included in the program. 

Conclusions: NESSY is easy to use open source software to analyze NMR relaxation data. The robustness and 
standard procedures are demonstrated in this article. 

Keywords: Protein dynamics, software, cpmg, conformational/chemical exchange, |js-ms motion, van't Hoff, 
transition state theory 



Background 

Proteins are responsible for many different and impor- 
tant biochemical functions, such as ligand binding, sig- 
naling or catalysis [1-4]. Many of these events occur on 
the us-ms timescale making the detection and interpre- 
tation of these motions critical for understanding their 
biological significance. Advances in solution Nuclear 
Magnetic Resonance (NMR) of biomolecules have 
enabled the measurement of protein dynamics on differ- 
ent timescales. Relaxation experiments measuring the 
two relaxation rates Ri (the longitudinal relaxation rate) 
and R 2 (the transverse relaxation rate) as well as the 
steady state heteronuclear NOE are used to detect 
motion within ps-ns [5], However, detection of these 
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motions is limited by the molecular tumbling of the 
molecule, which is on the order of ns. Recent develop- 
ments in 15 N/ 13 C relaxation-compensated Carr-Purcell- 
Meiboom-Gill (CPMG) and R lp rotating-frame relaxa- 
tion dispersion experiments [6-9] have enabled measure- 
ment of protein dynamics on the us-ms timescale. From 
these experiments the exchange contribution to trans- 
verse relaxation (Rex), exchange constant (k ex ), chemical 
shift differences between exchanging states (Sco) and the 
population of individual states can be extracted. Conse- 
quently, relaxation dispersion experiments are com- 
monly used to investigate changes in dynamics caused 
by ligand binding [10], to discover limiting events dur- 
ing enzyme catalysis [8], to follow protein folding and 
intermediate states as well as to detect functionally 
important hidden states [11]. Furthermore, it has been 
proposed that a detailed understanding of the dynamical 
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behavior of a protein at atomic resolution may help 
identify novel drug target sites [12]. 

Here, we present a novel software (NESSY) that is 
designed for simple and automated analysis of relaxation 
dispersion data. Data can be fitted to two states and at 
one or more field strengths. Model selection protocols 
are included for the selection of the best models and 
Monte Carlo simulations are used to estimate errors of 
extracted parameters. Furthermore, publication quality 
2D and 3D plots and color coded structures are easily 
created. 

Implementation 

Programming 

The software NESSY is written in Python http://www. 
python.org for multi platform compatibility and uses 
wx.Python http://www.wxpython.org to build the graphi- 
cal user interface (Figure 1). The program was tested on 
Windows (Vista and 7) and Linux (Ubuntu 9.10 to 
11.04) and can be downloaded for free from http:// 
nessy.biochem.unimelb.edu.au either as compiled bin- 
aries (Linux, Mac or Windows format, including all 
Python modules) or as source code. The software is 
open source and general public licensed (GPL v3). 



Theory 

NESSY performs automatic relaxation dispersion data 
analysis using NMR peak lists as inputs. In a first step, 
the effective transverse relaxation rate R 2 eff is extracted 
from a series of CPMG relaxation dispersion experi- 
ments according to: 

R f =l/T CPUG lnI{0)/l(v CPM G) (1) 

where T CPMG is the constant CPMG time, 1(0) is the 
intensity of the peak in the reference spectrum and / 
(vqpmg) is the intensity of the peak at the CPMG fre- 
quency, Vcpmg- Then, data are fitted to different models, 
which can be selected individually. These models are 
divided on the basis of no exchange (model 1), two-site 
fast (model 2) and two-site slow (model 3) exchange. 

No exchange: model 1 

Model 1 describes residues, which are not involved in 
exchange processes: 

Rf = R° 2 (2) 

where R® is the effective transverse relaxation rate at 
infinite v C pmg- 
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Figure 1 The graphical user interface of NESSY. Data is imported into the interface of NESSY and analysis can be followed in real time in the 
Start Analysis section. Results are summarized and grouped in the Results and Summary section. 
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Two-state exchange: model 2 and 3 

Equation (3) describes exchange between two states A 
and B 



A K-b 
A ^ B 

h-a 



Where k a , b and k h . a are the forward and backward 
exchange rates, respectively, between the states A and B. 

Model 2 describes fast-limit exchange (k ex >> 5oo) and 
is fitted to [7,13]: 

Rf = RUi[l-^tanh(^)] (4) 

where 

® = pa*Pb * 5o) 2 

and p a and p b are the populations of the two state 
models (p a is the major conformation, p a + p h = 1), k ex 
is the chemical/conformational exchange (k ex = k a ^ + 
kh-a) constant and Sco is the chemical shift difference 
between states. In eq. (4) only R®, k ex and & can be 
extracted, as p b and, Sco cannot be uniquely determined. 

Model 3 describes slow-limit exchange (k ex << Sco) 
according to the Richard-Carver equation [14]: 



R° 1 + 



2 ~ iV 2 1 2 

v C pMGCOsh~ 1 [D + cosh(r] + ) - D_ cos(?7_)] 



(5) 



where 



r±* + (* 2 +? 2 ) i/2 i 1/2 

= J= =! — 

' 2^2vcpmg 

= k ex 2 - 8co 2 
§ = -28co(p a k ex - p b k ex ) 



NESSY also automatically calculates the exchange 
contribution to transverse relaxation, R ex , where for fast 
exchange, R ex = &/k ex , and slow exchange, R ex = p a Pb- 
KJ(1 + (k ex /Sco) 2 ). 

Optimization and Model Selection 

Exchange parameters are optimized by using the Leven- 
berg-Marquardt algorithm to minimize the target func- 
tion 



n,eff,cak_ J? n,eff, 



(6) 



where R^ is the value at each v CPMG at data point n, 
^rheff,cak j s ca l cu l a ted rate constant. Error a Rf of 
R 2 e ff was assumed to be constant within each residue 
and calculated according to the definition of pooled 



variance (eq. 7) [15]. a Rf is calculated using N dup repli- 
cated experiments for Vcpmg values, such that rij repli- 
cates were obtained at {v C pMG)y The effective transverse 
relaxation rate of a given peak R e ^ in these rij replicates 



(3) has the standard deviation s 7 : 



2 T^T^-i) 



(7) 



Model selection is performed using AICc (Akaike 
information criteria with second order correction for 
small sample size) by default, but other tests, such as 
AIC or F-test are included as well [16]. Uncertainties 
are estimated using 500 (default value, but user con- 
trolled) Monte Carlo simulations. 

Synthetic data 

To demonstrate validity of fit and model selection per- 
formed by NESSY, synthetic data was produced using 
the in-built Synthetic Data Creator. Data were created 
for models 2 and 3 by introducing an error of 2, 5, 8 
and 10% (see Table 1) as experimental NMR relaxation 
dispersion data usually have smaller errors than 5%. 
Data was analyzed using NESSY and fitted to the mod- 
els described above. 

NMR spectroscopy 

Carr-Purcell-Meiboom-Gill (CPMG) relaxation disper- 
sion experiments of a 0.3 mM PCTX1 sample (kindly 
provided by Prof. G. King, University of Queensland, 
Australia) were recorded on a Bruker 600 MHz 
AVANCE III spectrometer. Experiments were acquired 
at 298 K using 0.08 s constant CPMG period (T CPMG ) 
as a single scanned interleaved pseudo 3D experiment. 
Relaxation dispersion profiles were obtained by record- 
ing spectra with varying CPMG pulse frequencies 
(v C pmg) (25, 2x 50, 75, 100, 2x 150, 2x 200, 300, 500, 
600, 700, 900, 1000, 1500 and 2000 Hz). Spectra were 
processed in NMRPipe [17] and intensities extracted 
using CcpNmr [18]. 

Results 

Synthetic data 

To demonstrate the validity of the coding and model 
selection, synthetic data containing 2, 5, 8 and 10% 
errors for models 2 and 3 were created using the in- 
built tool in NESSY. This synthetic data was fitted to 
models 1, 2 and 3 by minimizing the % 2 function (eq. 6) 
and the model selection was performed by comparing 
AICc values. Errors for the extracted dynamics para- 
meters were estimated by 500 Monte Carlo simulations. 
In all cases, irrespective of the noise, the correct model 
was chosen by AICc for each data set (Tables 1 and 2). 
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Table 1 Extracted dynamics parameters of synthetic relaxation dispersion data. 




Model 


R 2 [1/s] 


err 


k ex [1/s] 


err p b 


err 


0> 


So 


Model 2, original 


2 


15.23 




3750.3 






474574 




Model 2, ± 2% 


2 


14.93 


0.06 


4015.0 


34.8 




51506.0 




Model 2, ± 5% 


2 


15.49 


0.29 


3742.5 


188.1 




46250.6 




Model 2, ± 8% 


2 


15.11 


0.4 


3795.6 


251.8 




48212.4 




Model 2, ± 10% 


2 


14.35 


0.24 


4303.7 


148.4 




57796.3 




Model 3, original 


3 


15.23 




306.2 


0.072 






1875.5 


Model 3, ± 2% 


3 


15.18 


0.03 


330.8 


21.8 0.067 


0.004 




1874.8 


Model 3, ± 5% 


3 


15.18 


0.09 


330.3 


64.4 0.068 


0.017 




1921.0 


Model 3, ± 8% 


3 


15.67 


0.32 


231.7 


152.0 0.092 


0.110 




1800.0 


Model 3, ± 10% 


3 


15.52 


0.33 


363.9 


170.4 0.060 


0.082 




1767.5 



Notably, for the synthetic-data model 2 the % 2 values 
calculated for model 3 were always at least as good as 
those for model 2, indicating the need for model selec- 
tion by statistical methods such as AICc. For the syn- 
thetic-data model 3, both % 2 and AICc selected model 3. 
During the calculation, NESSY creates plots for each fit 
to a model For example, a comparison of fits to models 
1, 2 and 3 for the synthetic data is shown in Figure 2 
(3D plots are generated using the NESSY plotting tool). 

The parameters k ex) p b and Sco that were extracted for 
model 3 matched the initial parameters used to generate 
the synthetic data within the order of the introduced 
error (Table 1). As expected, introducing larger errors 
into the synthetic data sets produced larger errors for 
the extracted parameters (calculated by Monte Carlo 
Simulations). 

Fitting relaxation dispersion experiments of PCTX1 at 600 
MHz 

NESSY was tested using NMR data acquired at 600 
MHz for PCTX1, a 40-residue spider toxin [19]. Data 
for two resonances showing frequency dependent R ex 
were fitted to models 1 to 3. Trp 24 best fitted to model 
3 (two-state, slow-limit exchange) and Phe 30 to model 
2 (two-state, fast-limit exchange). An overview of fits to 
model 1 to 3 as well as the best fit is shown in Figure 3. 



Fitting of the slow-limit exchanging signal of Trp 24 
extracts a k ex of 1029.3 ± 194 1/s. Although the fit 
shows slow-limit exchange, only one resonance was pre- 
sent in the spectrum. This is in accordance to the 
extracted population of the minor conformation (p b ) of 
0.844 ± 0.002% and a chemical shift difference (Sco) for 
the two conformations of 2884 ± 76 rad/s. As residue 
Phe 30 is in fast exchange, only k ex and & were 
extracted. The extracted conformation exchange rate 
(kgx) f° r this residue was 3750.2 ±162 1/s. 

To best illustrate calculated motion, NESSY creates 
PyMol macros for color and width-coded structures. 
Such structures are created for model selection, k ex and 
R ex . Figure 4 shows color coding for both model selec- 
tion and k ex of PCTX1. Models are visualized in differ- 
ent colors per residue, while extracted dynamics 
parameters (k ex and R ex ) are colored from yellow to red 
and also represented by thicker lines for increasing 
values. Note that in Figure 4 only two residues were 
analyzed and therefore, only two residues are colored. 

Global fit to synthetic data 

The dynamics parameters, k ex , p b and Sco, are highly 
interdependent and data from a single static field is typi- 
cally insufficient for their accurate determination. Con- 
sequently acquisition of relaxation dispersion data at 



Table 2 Model selection of synthetic relaxation dispersion data. 



Synthetic Data Model 1 Model 2 Model 3 Selection 





x 2 


AICc 


x 2 


AICc 


x 2 


AICc 




Model 2, 2% error 


29742.68 


29744.68 


24.75 


34.75 


24.77 


38.77 


2 


Model 2, 5% error 


912.17 


914.17 


7.68 


17.68 


7.64 


21.64 


2 


Model 2, 8% error 


535.24 


537.24 


9.63 


19.63 


9.63 


23.63 


2 


Model 2, 10% error 


2001.9 


2003.9 


19.78 


29.78 


19.78 


33.78 


2 


Model 3, 2% error 


88082.77 


88084.77 


2721.03 


2731.03 


18.26 


32.26 


3 


Model 3, 5% error 


12471.09 


12473.09 


445.54 


455.54 


22.36 


36.36 


3 


Model 3, 8% error 


915.07 


917.07 


38.98 


48.98 


5.4 


19.4 


3 


Model 3, 10% error 


921.28 


923.28 


27.61 


37.61 


5.66 


19.66 


3 
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two or more fields is necessary for the robust and accu- 
rate estimation of these parameters [20]. NESSY is able 
to perform a global fit by sharing parameters k ex , pt> and 
Sea, but taking the field dependency of Sco into account. 
To demonstrate this function, synthetic data was created 
for data virtually recorded at 800 and 600 MHz. For 800 
MHz data, synthetic data (5% error) previously 
described was used. Additionally, synthetic data was 
generated at 600 MHz (3co 600 = 600/800 Sco 800 ). Both 
data sets were simultaneously fitted to models 1 to 3 
(Table 3). Model selection was performed using AICc 
and errors of fit for the parameters were estimated by 
500 Monte Carlo simulations per residue and individual 
experiments. 

For the synthetic data for model 2 at two different 
spectrometer frequencies, fits (% 2 ) generally improved 
with increasing number of parameters. Using AICc 
model selection NESSY clearly chooses the correct 



model (model 2). For the synthetic data for model 3, the 
correctly chosen model according to AICc was model 3 
(Table 3). A summary of extracted parameters is given 
in Table 4 and fits to models 2 and 3 for both synthetic 
data sets (each at two spectrometer frequencies) are 
shown in Figure 5. 

Cluster Analysis 

Under certain circumstances, motion in different parts 
of a macromolecule may be concerted. Therefore, multi- 
ple residues/nuclei may experience the same exchange. 
In such cases, it is useful to cluster these residues and 
fit them simultaneously. Cluster analysis of data 
recorded at one or more static fields is included in 
NESSY, where single values of k ex and populations are 
determined for all residues. Individual parameters, 5oo 
and cp (eq. 4) are obtained for each residue. To demon- 
strate this function, synthetic data for 4 residues 
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experiencing k ex of 306.15 1/s and a minor population 
(pb) of 0.072 were created using model 3 (2 state slow- 
limit exchange) consisting of 5% noise. The transverse 
relaxation rate, R®, as well as 8co were different for each 
residue. The data was clustered and simultaneously 
fitted to models 2 or 3 (Figure 6A). The selected model 
by AICc was model 3 and the extracted exchange con- 
stant k ex was 286.69 ± 66.5 1/s and population p b was 
0.075 ± 0.013. Furthermore, the extracted values for 5co 
and R® correlated to their initial values (Figure 6B, C). 

Discussion 

The software presented in this article permits the 
automated data analysis of relaxation dispersion 
experiments. The user has the opportunity to choose 
to fit to the entire data set or individually selected 
residues. In addition, data from two or more mag- 
netic fields can be globally fitted. During global fits, 



field dependency of Sco is taken into account. Multi- 
ple residues can be grouped and analyzed simulta- 
neously using built-in cluster analysis (individual or 
global fit). 

Maximum flexibility of data entry has been included. 
For example, peak lists containing peak intensities that 
are created by any other software can be imported; pro- 
tein sequences are read either from PDB files, retrieved 
from the internet using UniProt identifier http://www. 
expasy.ch or can be added manually; CPMG frequencies 
are read directly from Bruker VD (variable delay) lists. 
Furthermore, NESSY is linked to the Bruker Protein 
Dynamic Center (PDC, starting with version 1.1, http:// 
www.bruker-biospin.com, in collaboration with Dr. 
Klaus-Peter Neidig) so that projects can be exported in 
PDC and directly read into NESSY for extended data 
analysis. In addition, NMRView [21] tables can be 
imported directly. 
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I no data 

no exchange 
I fast exchange 2 states 

slow exchange 2 states 




Figure 4 Color coded structures created by PyMOL macros by 
NESSY. PCTX1 structure is color coded according to selected model 
(A) and k ex (B). Note only 2 residues were analyzed and therefore, 
the majority of the structure is colored in black (no data). Different 
models are colored differently; increasing values are colored from 
yellow to red and line width is increased for increasing values of k ex . 



During the calculation, NESSY produces plots and 
CSV files (text files compatible with Excel and OpenOf- 
fice) of the R e ^ profiles and the individual fits for each 
calculated residue as well as the extracted values (such 
as R 2 , k ex and p^) and statistics (ft 2 and AICc) for each 
model. In addition, color coded structures for selected 
models, k ex and R ex are created (structures are drawn 
using Pymol). Furthermore, NESSY offers tools to create 
custom made 2D and 3D plots in different formats and 
styles suitable for publication (see Figures 2 to 6). 

Synthetic data as well as experimental NMR CPMG 
relaxation dispersion data were analyzed using NESSY. 
The quality of fits of the synthetic data was assessed by 
comparing original values to those extracted from data 
analysis (Table 1 and 2, Figure 2). For choosing the best 
fitting model NESSY evaluates the need and benefit for 
describing the experimental data with more parameters 
by using AICc, AIC or F-test. By default, model selec- 
tion is performed using AICc to avoid over fitting, as 
more parameters (more complex models) may give bet- 
ter fits. The advantage of using AICc (and AIC) com- 
pared to F-test is that data do not have to be normally 
distributed. As relaxation dispersion experiments usually 



Table 3 Model selection of global fit of synthetic 
relaxation dispersion data. 



Synthetic 


Model 1 


Model 2 


Model 3 


Selected 




X 2 AICc 


I 2 AICc 


I 2 AICc 




Model 2 


545.3 549.3 


9.8 21.8 


43.4 57.4 


2 


Model 3 


3434.6 3438.6 


126.7 138.6 


10.1 24.1 


3 



consist of 15 to 20 data sets, normal distribution of data 
cannot be assumed. For the synthetic data the correct 
model was consistently chosen and the extracted values 
were statistically similar to the initial values. To demon- 
strate usability of NESSY for NMR data, two signals of 
PCTX1 were analyzed (Figures 3 and 4). One signal 
experienced slow-limit and the other fast-limit chemical 
exchange. In the case of slow exchange, the minor 
populated conformation was present at 0.8%, which is in 
accordance with the NMR spectra, where only one peak 
is visible. 

To be able to unambiguously distinguish between fast 
and slow exchange and to extract populations and shift 
differences, experiments should be collected at two or 
more different magnetic fields [20]. NESSY supports 
global fitting at multiple field strengths, while taking the 
field dependency of Sco into account. Synthetic data for 
models 2 and 3 at two different static frequencies (600 
and 800 MHz) were created and analyzed. Data were 
fitted to models 1 to 3 and in each case the correct 
model was chosen for both global fits (Tables 3 and 4, 
Figure 5). 

Conclusions 

In this article, the main features of NESSY have been 
presented, such as curve fitting, model selection and 
data presentation of relaxation dispersion experiments. 
Nevertheless, NESSY is not limited to these functions. 
As relaxation dispersion experiments enable the extrac- 
tion of populations of individual states, NESSY offers 
tools to calculate the free energy (AG) between states. 
For data that fit best to fast exchange models, the popu- 
lations of each state cannot be extracted, as k ex and Sco 
cannot be uniquely determined. An integrated function 
in NESSY is to calculate the populations for residues 
with known chemical shift differences, such as observed 
in ligand binding experiments. If a series of experiments 
are recorded over a temperature range, the temperature 
dependence of AG can be used to extract entropy (AS) 
and enthalpy (AH) changes using van't Hoff analysis. 
NESSY supports automatic van't Hoff analysis of both 
linear and non-linear models [22]. Furthermore, NESSY 
can calculate activation energy barriers using transition 
state theory and the Eyring equation and generate 
energy landscape plots. 

Taken together, we present user-friendly software for 
NMR relaxation dispersion ( 15 N/ 13 C) data analysis that 
requires minimal user intervention. In addition, NESSY 
can be used to analyze biophysical experiments, such 
as van't Hoff and transition state theory analysis and 
to create publication quality 2D and 3D plots. Due to 
its flexibility, users can choose between different 
relaxation dispersion models and statistical tests for 
model selection. Results generated by NESSY are 
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Table 4 Extracted dynamic parameters of global fit of relaxation dispersion data. 

Synthetic Model R 2 [1/s] err k ex [1/s] err p b err Sco [rad/s] err 

Model 2 15.23 3750.25 

Fit to Model 2 2 15.24 0.29 3876.90 194.54 

Model 3 15.23 306.15 0.07 1875.51 

Fit to Model 3 3 15.18 0.09 334.61 60.77 0.07 0.02 1918.98 20.50 



aimed to be usable for publication. Tables can be pro- 
duced directly using CSV files; 2D and 3D plots are 
created during or after analysis and color coded struc- 
tures can be directly used in publications. Note that 
each figure display in this article was created in 
NESSY. This software is open source and freely avail- 
able from http://nessy.biochem.unimelb.edu.au. NESSY 
comes with a detailed manual and tutorial. For addi- 
tional help, questions can be addressed at the NESSY 
mailing list (nessy-users@gna.org). 



Availability and requirements 

♦ Project name: NESSY 

♦ Project home page: http://nessy.biochem.unimelb. 
edu.au 

♦ Operating system(s): Platform independent 

♦ Programming language: Python 

♦ Other requirements: Scipy, Numpy, wxPython, 
Matplotlib (None for compiled binaries) 

♦ License: GNU GPL v3 

♦ Any restrictions to use by non-academics: None 
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Figure 5 Global fit for synthetic data at 2 field strengths. Synthetic data for model 2 (A and B) and model 3 (C and D) were generated for 
two field strengths: 600 MHz (red and purple) and 800 MHz (blue and green). (A) Model 2 data was simultaneously fitted to models 1 to 3; only 
fits to models 2 and 3 are displayed. (B) Best selected model (model 2) is separately plotted. (C) Synthetic data of model 3 was fitted to models 
1 to 3; only fits to models 2 and 3 are shown. (D) Best selected model (model 3) is separately plotted. 
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